A Possible Origin of Power-Law Behavior in n-Tuple Zipf Analysis

Authors

  • András Czirók
  • H. Eugene Stanley
  • Tamás Vicsek
Abstract

In n-tuple Zipf analysis, "words" are defined as strings of n digits, and their normalized frequency of occurrence v is measured for a given "text" (sequence of digits). In the case of various non-Markovian sequences, the probability density of the frequencies P(v) has a power-law tail. Here we argue that a broad class of unbiased binary texts exhibiting a nonexponential distribution of cluster sizes can indeed yield a power-law behavior of P(v), where we define clusters to be strings of identical digits. We support this result by numerical studies of long-range correlated sequences generated by three different methods that result in nonexponential cluster-size distribution: inverse Fourier transformation, Lévy walks, and the expansion-modification system. Our calculations shed light on the possible connection between the Zipf plot and the non-Markovian nature of the text: as the long-range correlations become dominant, the probability of the appearance of long clusters is increased, leading to the observed "scaling" in the Zipf plot. [S1063-651X(96)08006-3]
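The two quantities the abstract relies on, n-tuple word frequencies and cluster sizes, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the use of overlapping windows, and the normalization (frequencies summing to 1) are assumptions made for illustration only. The demo text is an uncorrelated random sequence, not one of the long-range correlated generators named in the abstract.

```python
from collections import Counter
from itertools import groupby
import random


def ntuple_frequencies(bits, n):
    """Normalized frequency of each n-digit "word" in the text.

    Assumption: words are read from overlapping windows, and the
    frequencies are normalized so that they sum to 1.
    """
    counts = Counter(tuple(bits[i:i + n]) for i in range(len(bits) - n + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def zipf_ranking(freqs):
    """Frequencies sorted from most to least common (the Zipf plot's y-values)."""
    return sorted(freqs.values(), reverse=True)


def cluster_sizes(bits):
    """Histogram of cluster sizes, a cluster being a run of identical digits."""
    return Counter(len(list(run)) for _, run in groupby(bits))


if __name__ == "__main__":
    rng = random.Random(0)
    # Uncorrelated unbiased binary text, used only as a baseline example.
    text = [rng.randint(0, 1) for _ in range(10_000)]
    ranked = zipf_ranking(ntuple_frequencies(text, 4))
    print("most and least frequent 4-tuple:", ranked[0], ranked[-1])
    print("cluster-size histogram:", sorted(cluster_sizes(text).items()))
```

For such an uncorrelated text the cluster-size histogram should fall off roughly exponentially. The abstract's argument concerns texts where it does not.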

Similar resources

N-tuple Zipf Analysis and Modeling for Language, Computer Program and DNA

n-tuple power law widely exists in language, computer program code, DNA and music. After a vast amount of Zipf analyses of n-tuple power law from empirical data, we propose a model to explain the n-tuple power law feature existed in these information translational carriers. Our model is a preferential selection approach inspired by Simon’s model which explained scaling law of single symbol in a...

Comment on "Linguistic features of noncoding DNA sequences"

In a recent letter [1], Mantegna et al. report that certain statistical signatures of natural language can be found in non-coding DNA sequences. The vast majority of DNA in higher organisms including humans consists of non-coding sequences whose function, if any, is unknown. Hence this new analysis is quite important. It suggests, as the authors concluded, "the possible existence of one (or...

Moment Analysis and Zipf Law

The moment analysis method and nuclear Zipf’s law of fragment size distributions are reviewed to study nuclear disassembly. In this report, we present a compilation of both theoretical and experimental studies on moment analysis and Zipf law performed so far. The relationship of both methods to a possible critical behavior or phase transition of nuclear disassembly is discussed. In addition, sc...

Strong, Weak and False Inverse Power Laws

Pareto, Zipf and numerous subsequent investigators of inverse power distributions have often represented their findings as though their data conformed to a power law form for all ranges of the variable of interest. I refer to this ideal case as a strong inverse power law (SIPL). However, many of the examples used by Pareto and Zipf, as well as others who have followed them, have been truncated ...

How Popular is Your Paper? An Empirical Study of the Citation Distribution

Numerical data for the distribution of citations are examined for: (i) papers published in 1981 in journals which are catalogued by the Institute for Scientific Information (783,339 papers) and (ii) 20 years of publications in Physical Review D, vols. 11-50 (24,296 papers). A Zipf plot of the number of citations to a given paper versus its citation rank appears to be consistent with a power-law...

Journal title:
  • Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics

Volume 53, Issue 6

Pages: -

Publication date: 1996